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Abstract. We review recent work on the statistical mechanics of Von 
Neumann's growth model and discuss its application to cellular metabolic 
networks. In this context, we present a detailed analysis of the physio- 
logical scenario underlying optimality a la Von Neumann in the metabolism 
of the bacterium E. coli, showing that optimal solutions are character- 
ized by a considerable microscopic flexibility accompanied by a robust 
emergent picture for the key physiological functions. This suggests that 
the ideas behind optimal economic growth in Von Neumann's model 
can be helpful in uncovering functional organization principles of cell 
energetics. 

1 Introduction 

In the late 1930's John Von Neumann published (originally in German) a simple linear 
model to describe optimal economic growth I1I2| . Quite generally, he conceived a sys- 
tem in which TV technologies can operate by combining M commodities. Technologies 
are distinguished by the amounts of goods they use as inputs and return as outputs. 
Moreover, they are linear (or constant returns to scale in economic jargon), in the 
sense that doubling the amount of inputs allows to double that of outputs (which 
implies that, for instance, saturation effects are neglected). Crucially, the system is 
required to be self-sustained, so that all goods necessary for production at each period 
must have been produced at the previous period. Given the input/output specifics of 
production processes, one poses the question of establishing which processes would 
be operated and which would instead be kept inactive if the system has to maximize 
the net production rate of commodities. It has been noted that Von Neumann, who 
had been trained in chemistry and chemical engineering, might have been inspired 
by chemical reactors for such a construction [3 . In any case, his growth model has 
become a cornerstone of the theory of economic growth |4l5j . 

Von Neumann's general idea can easily be transported to other contexts outside 
economics, where input/output processes of various kinds are wired together to form a 
heterogeneous network [61718] . A very notable example is found in cells. Metabolism, 
namely the complex web of chemical reactions underlying energy transduction in 
living organisms |9I10I11| . becomes an ideal candidate to test Von Neumann's ideas 
once the cross-membrane flow of matter is included in the above scheme. Given the 
fundamental nature of Von Neumann's assumptions, it is natural to ask to which 
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degree the chemical activity of real cells can actually be described by an optimal 
growth (in the sense discussed above) framework. 

On the other hand, the analysis of optimal growth in linear input/output systems 
is appealing for statistical physicists as well. As the system size increases, the opti- 
mal growth properties of 'random' input/output networks should hopefully become 
independent of the particular input-output relationships. Rather, it should be pos- 
sible to highlight a set of structural characteristics or constraints that are sufficient 
to frame optimal growth factors and activity profiles. Random input-output systems 
then constitute a key benchmark against which one can evaluate the performances of 
single real instances, and the theory of spin glasses and neural networks provides the 
technical and conceptual tools for this type of analysis [T^]- Indeed, their effective- 
ness to deal with linear problems close in spirit to Von Neumann's setup has been 
demonstrated in various instances in the past (see e.g. the so-called knapsack problem 
[13114] or even, more recently, the problem of general economic equilibrium [15] V 

Our goal here is, on one hand, to look back on the recent work concerned with 
the statistical mechanics of Von Neumann's growth model for random input-output 
systems. These studies have revealed a universal behaviour of the relevant quantities 
and allowed to highlight the key ingredients that can bring the basic framework closer 
to biologically important questions. Furthermore, we wish to discuss its application 
to cellular metabolism, which began more recently and is an ongoing and expanding 
challenge. To complement previous studies, we shall present a broad analysis of the 
physiological scenario underpinned by Von Neumann's model applied to the cellular 
metabolism of the bacterium Escherichia coli. In our view, such studies demonstrate 
that the basic postulate of optimal growth a la Von Neumann might provide a useful 
key to understand the organization of cell metabolism and energetics. 

2 Von Neumann's model 

The basic setup of Von Neumann's growth model is as follows. One considers a system 
of N distinct processes (technologies, reactions) that are capable of operating on M 
compounds (commodities, chemical species). The system is assumed to be closed, i.e. 
there is no external supply of compounds to the system. In other terms, compounds 
can only be produced from each other. Each process i is specified by a vector b; = 
{6f }, representing for each p, = 1, . . . , M the amount of compound p process i uses as 
input (the substrate), and by a vector — {a^ 1 }, representing for each p = 1, . . . ,M 
the amount of compound p process i outputs (the product). In the simplest case, it 
is assumed that af,6f > for each i and p. For sakes of definiteness, each process 
should have at least one input, whereas each compound should be the output of at 
least one process. In turn, the input and output vectors form the columns of the 
M X N input and output matrices, which we shall denote as B and A, respectively. 
Processes are assumed to operate linearly, i.e. when run at scale Sj > they use 
amounts Sibi of inputs to produce s,a, outputs. In this setup, the production side 
(its dual will be discussed later on) of Von Neumann's growth problem (also known 
as the 'technological expansion problem') is formulated in the following terms: 



In other words, one wants to find the largest p > (which we shall denote as p*) such 
that the system of inequalities 



max ( p ) 

p>0 



subject to (A — pB)s > 



(1) 



N 




(2) 



»=i 
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admits a solution s = {s^} with Sj > for all i (with the trivial solution = being 
ruled out). The physical meaning of the above conditions becomes clear by noting 
that the M-vector Bs (resp. As) encodes the total amount of each compound p, used 
as input (resp. returned as output) by the various technologies. Hence for fixed p, a 
vector s satisfying ^ guarantees that for each compound the total output is at least 
p times the total input. Maximizing p then amounts to finding the largest growth fac- 
tor sustainable by the system. Note that the growth factor is defined uniformly over 
compounds, and one does not try to find an optimal vector of growth factors, one for 
each compound. In such a setting (which would make for an interesting generaliza- 
tion) , the p* defined above would characterize the compounds forming the production 
bottlenecks, namely those with the smallest feasible growth factor. 

In a discrete-time dynamical setup |16j . one can imagine a scenario in which an 
operation scale is chosen for every technology at each time step t. Then the input 
vector I(t) = Bs(t) generates an output vector 0(t) = As(i), part of which is recycled 
into production (i.e. as the input at the next time step). The difference, namely 



is available e.g. for consumption or for other uses outside the production system 
strictly defined, including as waste. This condition ensures to be focusing on self- 
sustaining production patterns. A minimal constraint to be imposed for stability 
is that C(t) > 0, otherwise an external source of compounds would be needed to 
overcome the input shortages at time step t + 1. In economics, a central issue at this 
point is that of maximizing the discounted utility of the overall production over the 
possible time evolutions. A class of results known as 'turnpike theorems' [T7] show 
that optimal paths essentially coincide with the paths of maximal expansion described 
by Von Neumann's model. Therefore one may restrict to trajectories {s(i)} ensuring 
that I(t + 1) = pl(t) with p > (i.e. such that I^(t) = I^(Q)p r ). It is easily seen that 
such paths correspond, for operation scales, to s(t + 1) = ps(t), implying 



The stability requirement C(t) > for all t immediately leads to consider the problem 
([2]) and, in turn, Q. It is now clear that if the solution p* of ([lj is larger than one 
the system is expanding (i.e. trajectories are stationary states of constant growth at 
a growth factor at least equal to p*)\ if p* < 1 the system is instead contracting. 

Note that a linear combination As+A's' of two solutions s and s' with non negative 
coefficients A > and A' > always satisfies the constraints, i.e. the solution space of 
([2]) is convex. As p increases (clearly, all vectors s are solutions for p = 0) one expects 
its volume to shrink. As we shall see, a key question concerns the size of the solution 
space that survives at p* . 

A classical and easy to prove result concerns the existence of p* [I]. Surely, if p 
is sufficiently small the system ^ is bound to admit a solution, i.e. it is possible to 
find a p > for which As > pBs. On the other hand, it is also possible to choose 
p > so large that the sum of the row entries of the matrix A — pB is negative for 
each row, which would imply the absence of a solution of p* then coincides with 
the least upper bound of the set of p's for which ([2| admits a solution. Establishing 
the value of p* is certainly the first task to be carried out. An equally important 
challenge is however that of profiling the way in which processes are employed in 
states of optimal growth. Because not all processes need to be operated, one expects 
that some of them will actually be employed, whereas others will be shut down being 
unnecessary. Finally we shall focus on the characterization of the set of compounds 
that will become available for consumption (if any). 



C(t) = 0(*) - I(* + 1) 



(3) 



C(t)=p'(A-pB)s(0) 



(4) 
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3 Random input-output networks 

The rationale for studying random input-output networks is simply that the macro- 
scopic behaviour of a system of many interacting entities (technologies in Von Neu- 
mann's case) can be understood at the statistical level when the number of degrees of 
freedom diverges. In this limit, in fact, the key macroscopic observables are expected 
to become substantially independent of the specific realization of the couplings [T8] , 
Large systems with random couplings are therefore expected to be good approxima- 
tions of non-random ones. In this sense, they also constitute a fundamental bench- 
mark against which real systems should be compared. Luckily, a host of analytical 
and conceptual tools are available to deal with the statistical mechanics of systems 
with random interactions of the type relevant for Von Neumann's problem |12ll4j . 



3.1 Fully connected networks 



The fully connected Von Neumann model is defined by [16] 

a? =5(1 + a?) 



(5) 
(6) 



where a and b are constants (both > 0) whereas and (3^ are independent, identically 

2 

distributed quenched Gaussian random variables of mean zero and variances (af ) 

and ) 2 , respectively. In brief, processes in such a system employ every compound 
in a finite amount to produce every compound in a finite amount. The topology 
underlying this problem is hence that of a fully connected bipartite graph. Inserting 
the above definitions into ([2| one arrives at the conditions 



N 



N 

(3 - pb) Si + s * M ~ Mi) ^ V ^ 



i=l 



(7) 



Because Sj > 0, for N 3> 1 the first term dominates over the second term, which 
scales like y/N (assuming no correlations between Sj's and the Gaussian disorder). 
Therefore, to leading order in N, 

P* = a/b (8) 

This tells us that the leading part of p* is independent of the details of the in- 
put/output vectors. Rather, it is determined only by the average input and output 
coefficients. For sakes of simplicity, we shall henceforth set a = b = 1. To account for 
subleading corrections, it is convenient to set 



1 



9 
N 



(9) 



so that, for N ^> 1, p e g l^ and g can be interpreted as a growth rate. In turn, 
^ yields the conditions 



N 



Ct- — 














1 Vn_ 



> V^t 



(10) 



One is interested in studying the largest g for which ( 10 ) has solutions. The calcula- 
tion, detailed in [TB] follows standard steps of spin-glass-like systems (specifically, it 
is similar to a Gardner calculus 19 ) and we limit ourselves to a brief sketch here. 
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The solution space volume at fixed disorder and fixed g can be expressed as 

V{g) = T, s X\6{c»)5[Y J s l ~A (H) 

where a hard constraint X)^=i s i = N has been included so as to remove the linear 
degeneracy inherent in Von Neumann's problem (this is equivalent to setting a scale 
for the s^s). The typical solution space volume Vt yp (g) ~ exp(Nv(g)) can be obtained 
from 

v{g) = lim ^bg~^) (12) 



where the over-bar denotes an average over the quenched disorder (5[6). The evalu- 



ation of the disorder average can be carried out resorting to the replica trick. The 
relevant thermodynamic limit in this case is that where both N (number of processes) 
and M (number of compounds) diverge with a fixed ratio n = N/M . Here, after av- 
erages are computed, one can obtain an exact expression for v(g) via a saddle-point 
integration and a replica-symmetric Ansatz. An interesting outcome of the above cal- 
culation is that p* (or more precisely g*) depends on the disorder statistics only via 
the parameter 

k = W Z7 WY 2 (13) 

Specifically, one obtains that g* /Vk is a function of n only. This implies that, gener- 
ically, larger growth rates require larger spread in the input/output coefficients, sug- 
gesting that optimization (and thus process selection) can be operated by choosing 
maximally diversified technologies. It turns out that for g — > g* v(g) can be ex- 
pressed in terms of the average Euclidean distance between solutions of ([2| (denoted 
as x = x(<?))- I n particular, one finds 

via) * x lN (14) 

where 7 is a positive exponent. Optimal growth (i.e. g — > g*) can be retrieved in 
the limit x — > 0, in which case v(g) reduces to a single point (i.e. there is a a single 
configuration of operation scales that corresponds to an optimally growing system) . 
The resulting (n, g) phase diagram is shown in Fig. 1. In brief, g* < for n < 1, 
implying that all viable configurations of operation scales are contracting. When 
n > 1, instead, expanding states become viable. The growth rate generically increases 
with n. However the rate of increase changes drastically between the phase where 
optimal states are contracting to that where expansion is possible. Indeed one gets 

5 _l~ 9 _ (15) 



which suggests that an increase in N (i.e. in the number of available technologies) 
can benefit growth for sufficiently small n. By contrast, when n is large technological 
innovation appears to bear a much smaller effect on the growth properties. The saddle 
point conditions also allow to derive a simple algebraic equation valid at g* that relates 
the fraction ip of inactive processes (i.e. processes such that Sj = 0) to the fraction <j) 
of compounds /1 such that c M = (i.e. for which the overall output exactly matches 
the overall input): 

tpo = 1 - <f>o/n (16) 



In different terms, ( 16 1 simply states that for g = g* (when a single vector of operation 
scales survives satisfying (2|) the number of variables Sj ^ (namely iV(l — -0 O )) 
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Fig. 1. Phase structure of the random fully connected Von Neumann model. Values of n 
and g in the shaded region are viable, i.e. there exist configurations of Si's satisfying pi). 
The continuous line is g* /y/kn. For n < 1 expanding states are unfeasible. Only for n > 1 
can one obtain a feasible expanding state [16] . 



should equal the number of equalities (namely M(f> ). Both ip and ^>o increase with 
n in a roughly sigmoidal way, with ipo( n = 1) = 4>o{ n = 1) — 1/2- In words, optimal 
states in the contracting regime (n < 1) are characterized by having almost all 
processes operated and almost all compounds available for consumption. In optimal 
expanding states, instead (n 3> 1), a small fraction of processes is profitable while 
almost all compounds are perfectly recycled with limited production of consumable 
goods or waste. 

3.2 Finitely connected networks 

A mathematically more compelling scenario is obtained when the topology underlying 
the production networks is characterized by finite connectivity, i.e. when each process 
uses a finite number of compounds as substrates to produce a finite number of other 
(different) compounds 20 . This more realistic case can be conveniently dealt with by 
the cavity method under the assumption that the bipartite graph that underlies the 
production network has a locally tree-like structure. In essence, cavity theories exploit 
the fact that on such graphs the removal of a node of one type (either a compound or 
a reaction) makes the nodes of the other types (reactions or compounds) that were 
linked to the node we removed roughly statistically independent, since short loops 
causing feedbacks and correlations are assumed to be absent (or anyway negligible). 
So if we denote by the scale variables of processes connected to a compound [i 
and by c M<El the variables defined in ^ for compounds connected to a process i, then 
their probability distributions are such that 



unless compound fi or, respectively, process i are removed, in which case one has 



pisi^^Hmisi) and qid**) 



(17) 




(18) 
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where the subscripts (i) and (p) indicate the removal of process node i and of com- 
pound fi, respectively. Note however that g^(c M ) can be written explicitly as 



(19) 



where j E fi\i indicates that the operations extend over all processes connected to p 
except i. Similarly, introducing the fictitious energy function 



E 



M 



one has 



pf\si) (x Tr c vei\„ exp 



(20) 



n ^)(^) ( 2i ) 



(ultimately, the limit of large j3 must be considered). In turn, knowledge of the so- 
called cavity distributions (19 1 and (21 1 allows to reconstruct the entire probability 
distribution since 



Pi(si) oc Tr cl .€<g(t)(c ,/e *)e3cp 



(22) 



(23) 



J 6 A" 



Equations (19 1 and (21) can be solved self-consistently by population dynamics. The 
numerical procedure to extract p* from this setup is non-trivial and we refer the 
reader to [20] for details. The lesson one learns from this analysis is that the picture 
obtained in the fully connected case is rather robust to dilution. Finite connectivity 
does however bear a strong quantitative impact. In the contracting regime (p* < 1) 
optimal growth rates generically decrease with increasing dilution while the opposite 
happens in expanding regimes with p* > 1. Moreover, optimal growth rates turn out 
to be consistently larger when processes have heterogeneous (as opposed to regular) 
connectivities, independently of the degree distribution of compounds. Hence, gener- 
ically dilution (in expanding regimes) and stochastic connectivities allow to achieve 
larger growth rates with respect to the case of fully connected input/output systems. 
The phase structure shown in Fig. 1 is however qualitatively preserved. 



3.3 Reversible networks 

An important generalization considers processes as reversible, i.e. such that the roles 
of input and output coefficients can be interchanged [21] . This simple extra ingre- 
dient generates a surprisingly rich and considerably more complex scenario. Let us 
first notice that by allowing processes to revert Von Neumann's problem becomes 
intrinsically non-linear. Indeed it can be written compactly upon defining a matrix 
R with entries 

R n_\ a t-P h t fors l >0 

Ul m-pat for Si <0 [ > 
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whereby one sees that R = R(p, s). (The fact that process can be reverted is reflected 
also in the possibility that s$ becomes negative.) Under this condition, ([I]) now reads 

max p subject to R(p, s)s > (25) 

p>0 

There are two basic properties of the reversible Von Neumann problem that are both 
easy to prove and rich of consequences. The first is that in general reversing all 
processes is not equivalent to time reversal, i.e. the naive expectation that if a scale 
vector s guarantees a growth factor p then the reversed vector s' should guarantee a 
growth factor p' = 1/p turns out to be false. To see this, define 

N 



S(s)=$>(tf-/*n (26) 

i=l 
N 

?(s)=5>(&?-A?) (27) 



and note that 



A' 



c£(s)+^( S ) = (l- W ')E s *< (28) 



i=i 



Clearly, (s) = c£ (s) = implies p' — 1/p but if c(f (s) > then 

PP' < 1 - (29) 

which tells us that c^(s) = for all p is a necessary but not sufficient condition for 
p' to be equal to 1/p. In general, 

"M 1 -^*) <30) 

The second notable fact is that the solution space ceases to be convex, in particular 
for p < 1. Re- writing ([2| in this case as 

N 
i=l 

one easily sees that 

c^(As + A's') = Ac^(s) + A'c"(s') + AM ( S > s ') ( 32 ) 

with 

AT 

A"(s, s') = (1 - p) + 6f + AV^-aJ) - (As, + A's^C-A* - VsJ)] 

»=i 

. (33) 

If s and s' are both solutions, then a linear combination will again be a solution if 
> 0. Sis'j > indeed implies — for each p. However for SjS^ < one finds 
that yl^ > only for p > 1. In other words convexity is guaranteed only for p > 1 
and may not occur even at p* when p* < 1. Note however that the solution space 
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Fig. 2. Phase structure of the random fully connected reversible Von Neumann model. The 
continuous (resp. dashed and dot-dashed) line is g* / \Jkn(l + (j>) for <j> = (resp. = 1/2 
and <j> i = 1). Markers correspond to optimal growth rates estimated numerically as explained 
in [5T]. (For clarity, the shaded region is only shown for ij> — 0.) For n(l + <f>) < 1 expanding 
states are unfeasible. However one sees slight disagreements between the predicted (replica- 
symmetric) value of g* and computed one. 



is obviously convex for p = 0, showing that p < 1 is a necessary but not sufficient 
condition for the solution space to be non-convex (e.g. fragmented in disjoint parts). 
It is very important to remark that non-convexity poses a serious problem for the 
design of efficient algorithms to sample the solution space in the contracting regime 

The statistical mechanics analysis of this case in a fully connected topology can 
proceed along steps similar to those performed for the random fully connected net- 



work, with (111 replaced by 

/ n 



7( S )='&,IIW EW-JV (34) 



If 4> denotes the fraction of reversible processes (i.e. the effective number of processes 
that can be operated in the system is N+N<fi) , one finds an expanding phase with g* > 
for n > n c = 1/(1 + <j>) (see Figure 2). This suggests, as expected, that being able 
to run processes in both directions extends the range of the regime where expanding 
solutions exist. The price to be paid is that optimal growth rates in the expanding 
regime are comparatively smaller when processes are reversible. For n(l + </>) < 1, 
instead, the system is confined to a contracting regime. Here, larger tf>'s imply larger 
growth rates, meaning that increasing the availability of reversible processes benefits 
the system's growth properties. As seen above, however, the contracting phase is 
characterized by lack of convexity. This is hinted by the fact that the computed and 
predicted (replica-symmetric) values of g* disagree (albeit just slightly) for n < n c 
when (j) ^ 0. To see this more explicitly one can measure directly the probability 
that the uniform linear combination of two solutions solves Von Neumann's problem 
(Figure 3). Inspecting results, one sees immediately that the solution space displays 
marks of convexity for sufficiently small g < g* (recall that the solution space is 
certainly convex for p = 0). In the intermediate range of values of g up to g* , instead, 
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Fig. 3. Probability that the linear combination of two solutions with coefficients 1/2 solves 
the fully reversible (0 = 1) random fully connected von Neumann problem for n = 1/2. The 
solution space appears to lose convex values of g < g* — 0. Different curves correspond to 
different values of N increasing in the direction of the arrow from 50 to 500. 



convexity appears to break down. The replica-symmetric theory one develops for this 
case is thus expected to fail in a range of values of g < (as seen in Fig 2). 

As before, an increase in the optimal growth rate causes a decrease in the number 
of compounds with d 1 > 0. This corresponds to a 'waste' reduction and, on the 
other hand, to an impoverishment in the global output profile. Quite strikingly, the 
reversible part of the process set displays a neat second-order transition as n exceeds 
n c : all reversible processes are employed in the contracting phase, where a finite 
fraction of them is inactive when g* > {2Tj . 



4 Towards metabolic networks 

Von Neumann's problem finds a natural ground of application in chemical reaction 
networks, in which the input-output relations linking technologies to goods in pro- 
duction systems are replaced by the wiring between reactions and chemical species 
governed by stoichiometry. Perhaps most importantly, in this context living cells 
present us the opportunity to test Von Neumann's ideas directly in a real system, 
namely their energy metabolism. 

In cells, an enormous number of genetic, regulatory and biochemical mechanisms 
interact on different spatial and temporal scales to accomplish a wide range of tasks, 
from growth and reproduction to motility, homeostasis and exterior sensing. The 
rapid development of high-throughput genome sequencing techniques together with 
the availability of refined gene annotations has allowed, over the past few years, to set 
up consistent, controlled protocols for reconstructing the cellular metabolic networks 
of different organisms to an unprecedented degree of detail |9l22j . Such networks 
underlie the basic energy harvesting processes in cells, by which the energy derived 
from nutrients is transduced into usable forms of mechanical or chemical energy. The 
output of metabolism is constituted essentially by the so-called building blocks for 
functionally relevant biological macromolecules (amino-acids, fatty acids, nucleotides) 
as well as by waste products like e.g. carbon dioxide and acetate (mostly for bacteria). 
Building blocks are then used in a series of processes downstream of metabolism to 
form functional proteins (including reaction-catalying enzymes) , membrane structures 
and nucleic acids. To a large degree, the physiological outcome of this can be read off 
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from the cell's proteome, i.e. from the composition of the repertoire of proteins that 
the organism was able to produce, which in turn effects the regulation of metabolism. 

It is immediately clear that, once the structure of the network (encoded in the 
reactions' stoichiometry) is known, Von Neumann's stability criterion can serve as a 
minimal model for the organization of metabolism. However, before discussing this 
application more in detail (Sec. 5), it is worth trying to bring the theory a few steps 
forward (in a direction relevant to biology) by addressing a couple of peculiar aspects 
of biochemical reaction networks that, on one hand, have a direct influence on the 
solutions of Von Neumann's problem and, on the other, generalize it. These are, 
respectively, (a) the emergence of the so-called 'conserved pools of reagents', which 
translate into the need to consider constraints on the quenched disorder (the input- 
output coefficients) , and (b) the highly important issue of thermodynamic stability. 



4.1 Stoichiometric quenched disorder 

In first place, addressing optimality in reaction networks requires a more careful eval- 
uation of the role of stoichiometry, since real networks are of course not random: 
stoichiometric coefficients enforce chemical balance at each reaction node in the net- 
work. This set of local relations has been shown to give rise, at a larger scale, to 
conservation laws for the aggregate concentration of pools of reagents [53] . Such laws 
are expressed by the relation 

5>?-tf)=u Vi (35) 

where P C {1, . . . M} is the set containing the chemical species that form the con- 
served pool. In concrete, the total number of molecules in a pool is conserved over 
time, so that the aggregate concentration does not change in time (while the con- 
centrations of individual metabolites can change relative to each other). These dy- 
namical invariants have a purely topological origin in the structure of stoichiometric 
coefficients. As a byproduct, one easily understands that metabolites belonging to 
a conserved pool cannot be global network outputs (a detailed theory of this fact 
with applications is developed in (24 25). In other terms, in presence of a conserved 
metabolite pool one expects that p* < 1. 

A first step in driving Von Neumann's model towards real cellular networks con- 
sists therefore in studying how the existence of conserved metabolite pools affects 
the picture derived in the case of random networks. The simplest way to implement 
this ingredient is to impose a constraint on the quenched disorder. Specifically, one 
can introduce a 'random' conserved pool formed by a finite fraction (f> of reagents by 



explicitly adding to the volume defined in (11) a hard constraint of the form 



M 

5>>f-6f) = (36) 
(where z M is a quenched random variable equal to one with probability e and zero 



otherwise). In order to check explicitly that the constraint (36) implies p* < 1, i.e. 
that at best optimal growth states are characterized by constant operation scales for 
reactions (fluxes), it suffices to multiply each term in ^ by z M and sum over p (we 
ignore reversibility). One obtains: 



N M N M 



(i - p) 5> E +E s >E - K) > o (37) 
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Fig. 4. Phase structure ol the random fully connected constrained Von Neumann model. The 
continuous line marks the separation between the regions with g* — (stationary regime) 
and g* < (contracting regime). The expanding regime lies on the e = axis (n > 1) and 
is represented by a green strip 26 . 



If the vector z identifies a conserved pool then the second term vanishes. The first 
term allows for p > 1 only if all reactions connected to metabolites in the conserved 
pool are inactive (i.e. if the pool is effectively removed from the network). In particular 
this leads to the trivial solution Sj = for each i in a fully connected network. Hence 
necessarily p* < 1. So the states of expansion recovered in the previous sections are a 
'pathology' of random systems. When stoichiometric quenched disorder is used they 
cannot be present. Notice that this does not mean that the network cannot have a 
positive net production of compounds. The introduction of ( |36[ ) slightly modifies the 
disorder averaging but the replica theory developed for random input / output systems 
can be conveniently extended to this constrained case [26] . The (e, ri) phase structure 
one obtains, in agreement with the previous observations, is shown in Figure 4. 

A noteworthy consequence of the fact that p* cannot exceed 1 is that for large 
enough n, when an expanding phase would be expected in absence of the conserved 
pool, multiple optima survive, i.e. the average distance between solutions does not 
vanish as g — ¥ g* . So stoichiometric disorder appears to increase robustness in reaction 
networks, since many (microscopic) arrangements of reaction operation scales turn 
out to be compatible with optimal growth. It is interesting to note that real cellular 
metabolic networks typically have values of n larger than one. This suggest that 
the ability to respond to perturbations (e.g. by re-arranging scales) without losing 
the desirable property of maximum productive capacity may be a key ingredient for 
biological stability and resilience. 



4.2 Von Neumann's dual problem and thermodynamics 

Thus far, we have limited ourselves to considering Von Neumann's problem in a pro- 
duction perspective. Our focus has been set on finding ways to operate the available 
processes that ensure self-sustainability and optimality. In his original work however 
J. Von Neumann has given equal attention to the dual problem (in the sense of linear 
programming duality) 2 . In elementary economic terms, it can be introduced upon 
defining an M- vector p of prices (p M > for each /i). The cost associated to running 
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a certain process i at unit scale is simply X^Li-P^f- This generates a corresponding 
revenue given by • The problem dual to (Jib consists in solving 



min a subject to p(A — erB) < (38) 

(T>0 

(compare this with ([l}). The interpretation of the parameter a is not straightforward. 
The simplest is in terms of an interest factor. Imagining that the operation of processes 
is financed by borrowing cash at a rate r, at the end of each period a = 1 + r units 
of cash must be returned to the financiers. In this setting, the conditions 

M 

(a* - <rf£) < V* (39) 

simply mean that the net extra profit generated by each process can at most be zero. 
(Zero profit is a standard scenario in models of economic equilibrium, see e.g. |27j or, 
in a statistical physics perspective, [25].) 

It is simple to show that, in the above settings of |l} and (38) and under broad 



assumptions, p* > a* (with a* the solution of (|38[)) [4j. Furthermore, restricting 
slightly the definition of the model it can be proventhat indeed p* = a* (i.e. that the 
growth factor equals the interest factor, as would perhaps be intuitively expectable). 

It is however tempting to search for a physical interpretation of Von Neumann's 
dual problem. The simplest connection resides in the thermodynamic scenario under- 
lying biochemical activity. It is well known that in non equilibrium steady states with 
non-zero fluxes, reactions must proceed in directions of decreasing free energies (which 
in turn implies that thermodynamically feasible configurations of reaction fluxes may 
not contain directed cycles) [29] . The condition for thermodynamic feasibility reads 

v l AG l < Vi (40) 

where AGi is the (Gibbs) free energy change associated to reaction i and v t is the 
net flux (the difference between the forward and reverse rates) of reaction i. AGi can 
be written as the sum of the chemical potentials of the compounds weighted by their 
respective stoichiometric coefficients. It then follows that thermodynamic feasibility 
is related to the existence of a chemical potential vector g such that 

M 

Vi 5>>f-6f)<0 Vz (41) 



Connection to (39) is now straightforward. In this respect, prices play the role of 



chemical potentials and (39) have the form of a thermodynamic feasibility constraint. 



It would be interesting to construct a more stringent physical analogy for ( 38 ) . 
Quite recently an attempt in this direction has been performed in conjunction with 
Flux-Balance- Analysis (FBA) [9 3(3] . It was shown in specific that the dual to the 
linear programming problem that arises in FBA can be re-formulatcd as a free en- 
ergy consumption minimization problem conditioned on a given free energy drain 
(representing, in the case of cellular systems, growth) [3"T] . 



5 Application to cellular metabolism 



Modeling a cell's metabolic activity is a central problem in systems biology. In prin- 
ciple, knowledge of the network structure and of basic thermodynamic parameters 



14 



Will be inserted by the editor 



like reaction constants allows to formulate non-linear differential equations for the 
enzyme-mediated joint evolution of reactant concentrations and reaction rates (or 
fluxes) that fully account for the kinetics of a biochemical reaction network [29 . Un- 
fortunately at genome scale this route is most often barred by uncertainties about 
kinetic parameters and reaction or transport mechanisms. Constraint-based models 
of metabolism (like Flux-Balance Analysis 9 29 30 32 ) provide a feasible alternative 
that has marked a considerable breakthrough in the quest to reconstruct and sim- 
ulate global cellular functions. In a nutshell, they rely on steady state assumptions 
to impose sets of physically motivated linear constraints (enforcing mass balance) on 
the network nodes corresponding to reagents, and retrieve the viable flux states as 
the configurations of reaction operation scales that ensure that all constraints are 
satisfied. Regulatory and thermodynamic restrictions are included either in the al- 
lowed ranges of variability of fluxes or in the a priori reversibility assignments of 
reactions. Such a constrained linear system suffices to define a space of feasible flux 
configurations (whose characterization is in itself a challenging problem [33 34 ). Quite 
crucially physiologically relevant states are usually assumed to maximize ad hoc ob- 
jective functions that provide a further functional constraint on the cell's metabolic 
activity (see e.g. |35|36|37(38] for some examples). 

Applying Von Neumann's idea to cellular metabolism amounts to taking a rather 
different (though not unrelated) viewpoint. Recalling that in real stoichiometric sys- 
tems p* = 1, finding optimal solutions in the Von Neumann sense means finding flux 
vectors s such that 

N 

c^ = J2«-K) Si >0 (42) 

i=l 



where the index i runs over reactions, // over chemical species and and denote 
the output and input (respectively) stoichiometric coefficients of metabolite /x in re- 
action i. As explained above, ( |42| ) encodes in essence a stability constraint. If fluxes 
that transport matter (e.g. nutrients) into the cell are included in the stoichiometric 
matrices to make the system self-sustainable (in such a way that the nutrient con- 
sumption equals its influx) , then solving ( 42 ) allows to retrieve information about 
what the cell is in principle capable of net-producing in a given environment, as well 
as on the corresponding flux configurations. Metabolites for which c M = indeed turn 
out to be mass-balanced, so that their production and consumption fluxes exactly 
match. On the contrary, if c M > then metabolite fi is globally producible, meaning 
that it can become available cither for macromolecular processes or as waste. The 
availability of a neural network learning-based algorithm for the efficient generation 
of solutions of (42 1 (see based on [35]) then permits a detailed statistical analysis 
of the cell's metabolic capabilities. 



Work performed along these lines for the bacterium Escherichia coli |40I41| has 
shown that environment selection restricts the feasible states according to (42 1 to flux 
configurations that both reproduce well the available empirical evidence on E.coli's 
metabolic fluxes and guarantee that the correct physiological task (in that case, 
growth or biomass production) is carried out by the cell without the need of addi- 
tional ad hoc functional constraints. We want to add here further evidence supporting 
the relevance of Von Neumann's observations in the context of cell metabolism. 

We have computed and analyzed optimal flux configurations (in Von Neumann's 
sense) for the Escherichia coli metabolic network reconstructed in |43] . preparing it 
in a minimal growth medium with gluxose as its main carbon source. At odds with 
the work presented in [40141] . we have modified the model to be able to measure 
the levels of some of the physiological objective functions used in the computational 
biology literature (see e.g. [35]). Our goal is to characterize the solution space defined 
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Fig. 5. Von Neumann-optimal states of the Escherichia coli metabolism. (Left) Distribution 
of the flux overlap (black) and production profile overlaps (red) over distinct solution pairs 
obtained from a set of 250 solutions of (421. (Right) Flux distributions obtained from (42 1. 



Black markers denotes the distribution of average fluxes, the orange curve corresponds to 
the power-law fit with exponent —1 and red markers correspond to a convenient binning 
of the data. Inset: flux distributions in various solutions (different colors). Flux units are 
arbitrary. 




by ( 42 ) from a view point of how disperse solutions are in terms of flux arrangements 
and biological functionality. 

The similarity between two solutions a and /3 with flux vectors s a and sp respec- 
tively is conveniently quantified by their overlap. For each reaction i, we define 

for s ia < e and < e 
if e < s ia < 1 and s lf3 < e 
<I " :{,): in if SM >land^<e ^ 

otherwise 

where e is a cut-off value below which we treat fluxes as zero (results do not depend 
on the choice of the cut-off as long as e is 10 -5 or less). This parameter measures in 
essence the variation of flux i in solutions a and (3. For a complete flux configuration, 
we define 

Qa>f3 = jf^2q a /3(i) (44) 

i 

as a characterization of the "distance" between solutions a and /3: the closer q is to 1 
(resp. 0) the more similar (resp. dissimilar) the two configurations are. Likewise, we 
define the overlap between the vectors c that correspond to the production profiles in 
each solution of ([42]). The distributions of these overlaps are displayed in Figure 5. One 
sees that the distance between solutions has a broader distribution than that between 
production profiles, indicating that a degree of flexibility in the flux organization is 
allowed that nevertheless ensure a remarkable robustness in the emerging production 
profile. The finding of [H] that indeed the most probable output of the network is 
formed by the biomass constituents [44| is hereby fully confirmed (data now shown), 
as is the fact that the distribution of fluxes (see Figure 5) retains the empirically 
observed |45j power-law shape with exponent —1. Single solutions however display a 
certain degree of variability, as can be seen again in Figure 5. 

Let us now analyze how this scenario reflects in the values of some of the re- 
current physiological objective functions used in the literature (see Figure 6). We 
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Fig. 6. Von Neumann-optimal states of the Escherichia coli metabolism. Probability distri- 
butions of the values of commonly employed physiological objective functions obtained from 
the solutions of (42 1: (a) bare ATP yield; (b) bare NADH consumption; (c) total net flux; 
(d) bare glucose consumption; (e) bare NADH consumption per unit of GLC consumed; (f) 
bare ATP yield per flux unit; (g) bare ATP yield per unit of GLC consumed. Plots (1) and 
(2) display the flux pertaining to a biomass production reaction computed in a network re- 
construction where such a reaction was explicitly included. This flux represents an addition 
over the biomass that the network spontaneously yields: (1) bare extra biomass flux; (2) 
bare extra biomass flux per flux unit. Flux units are arbitrary. 



have monitored in particular: the total production of ATP (adenosine-triphosphate, 
the key molecular energy carrier in cells), the total consumption of NADH (nicoti- 
namide adenine dinucleotide, the key reducing agent in electron transfer processes in 
metabolism, with several other important functional roles at different levels in cells), 
the total net flux (which is assumed to be a good proxy for enzymatic efficiency) 
and the glucose (GLC) uptake (measuring nutrient consumption). ATP production, 
NADH consumption and total net flux display a roughly unimodal shape, pointing to 
the fact that under Von Neumann's hypothesis the network is capable of tuning these 
functions within a relatively small range of values. Notice that the total flux is usually 
assumed to be minimized by cells, to account for optimal use of the available pool of 
enzymes. On the other hand, it is reasonable to think that cells under certain condi- 
tions maximize the ATP yield for optimal energetic efficiency. A similar picture holds 
for the ATP yield per flux unit, which is usually assumed to be maximized so as to 
guarantee optimal energetic efficiency at minimum enzyme usage. NADH consump- 
tion is instead thought to be minimized, to reduce the activity of redox processes. 
GLC consumptions displays instead a somewhat broader profile. This is consistent 
with previous results [JT] and points to the fact that the emerging physiologic scenario 
(which appears to be robust from the energetic and enzymatic efficiency viewpoints) is 
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Fig. 7. Von Neumann-optimal states of the Escherichia coli metabolism. ATP yield (inset) 
and ATP yield per flux unit (main) versus extra biomass yield in states where an extra 
biomass flux is forced in (421. 



indeed recoverable under widely varying GLC consumptions. In other words, glucose 
availability can fluctuate within a broad range without affecting optimality in the Von 
Neumann sense. This however does have an effect on other physiological observables 
like the ATP production or the consumption of NADH per unit of GLC consumed. 
These functions turn out to be broadly distributed as well, although with well defined 
peaks at low values. Finally, we have considered the effect of adding explicitly the 
biomass production reaction to the pool of processes encoded in the network recon- 
struction. Recalling that the system globally outputs biomass constituents even in 
absence of this additional reaction, Figure 6 shows that the extra biomass flux due to 
the inclusion of the additional process is essentially null. In other words, the unforced 
biomass output of ( 42 ) by itself fully describes the cell's production capacity in the 



environment we selected. In summary, the physiological scenario emerging from Von 
Neumann's model applied to Escherichia coifs metabolism is one in which the vari- 
ability of optimal flux configurations underlies considerable robustness at the level of 
production profiles as well as at the level of key metabolic functions, the main ex- 
ception being glucose consumption, which instead is allowed to fluctuate over a large 
relative range. The latter aspect contributes to robustness as well. 

To conclude, we have investigated the problem of if and how forcing an extra 
biomass flux from the cell (i.e. forcing a non-zero value for the flux of the biomass 
reaction) changes the physiological scenario with respect to the reference case in which 
the biomass output is obtained self-consistently from the solutions of ( [42] ) . Figure 7 
shows how the bare ATP yield and the ATP yield per flux unit (two elementary 
parameters to quantify the physiological state of the cell) change upon increasing 
the extra biomass flux. In first place, as should be expected, the bare ATP yield 
slowly increases initially as a non-zero additional biomass flux is forced, but its value 
tends to saturate as the latter increases, indicating that the system has reached a 
maximal ATP output capacity. The relative gain in ATP yield with respect to the 
unforced reference state is close to 30%. At the same time, however, the ATP yield 
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per flux unit, which as said above quantifies energetic efficiency against enzyme usage, 
is actually reduced as the biomass flux increases, meaning that the cell can achieve 
larger biomass outputs and a gain in bare ATP yield only at the cost of a dramatic 
decrease in the overall metabolic efficiency (the relative loss being close to 50%). Quite 
significantly then, optimal states from the physiological viewpoint appear to be those 
defined by (42) without any additional constraint. Similarly (data not shown) NADH 
consumption increases upon increasing the forced biomass flux while, by contrast, 
GLC consumption roughly stays constant until the extra biomass flux reaches the 
value 0.5, when it starts increasing dramatically. 

These results suggest that Von Neumann's optimal states are on one hand able 
to provide a description of the metabolic activity of bacterial cells that is in good 
agreement with the empirical knowledge; on the other, they frame key physiological 
observables in contained ranges (implying robustness) so that forcing an improvement 
in the growth capacity leads to a decrease in metabolic efficiency, thereby pointing to a 
direct biological counterpart of Von Neumann's economic optimality. Clearly however, 
further work is needed in order to clarify the extent to which Von Neumann's model 
might be applicable in the context of metabolic networks, and above all for more 
complex cell types (like yeast or human cells), where regulation and kinetics may 
play more important roles and steady state models might turn out to be insufficient 
to identify the emerging physiological characteristics with a good degree of confidence. 



6 Conclusions 



Von Neumann's linear growth model addresses the very general problem of character- 
izing a self-sustained network of interacting input-output processes in terms of (a) the 
maximum achievable growth rate, (b) the pattern of activity of processes and (c) the 
emergent net production profile. Despite its remarkably simple rationale, this setup 
makes for a challenging and easily generalizable statistical mechanics problem that 
gives rise to a rich phenomenology which can be studied both analytically (for ensem- 
bles of random networks) and numerically (for single instances). The theory described 
here covers in our view only the simpler and more straightforward modifications of 
the original model, all guided by a chemical rather than economical intuition. From a 
strictly theoretical viewpoint (but possibly also with direct relevance for many appli- 
cations) it would be interesting to bring them farther beyond Von Neumann's original 
definition, e.g. by relaxing or abandoning the assumption of constant returns to scale. 
Perhaps the most notable application of these ideas outside of economics (where they 
were originally conceived and where they form a basis for the theory of economic 
growth) can be located in cellular metabolism. Applying Von Neumann's growth 
model to cell metabolism, in our view, corresponds to appraising to what extent and 
in what manner do environmental, stiochiometric and thermodynamic factors limit 
and determine the output of a cell's biochemical machinery within a minimal physical 
assumption of stability. The quantitative analysis of intracellular chemical reaction 
networks performed along these lines has provided a physiologically sensible charac- 
terization of the metabolic capabilities of the bacterium Escherichia coli (as well as 
of other, simpler cell types [42]) in agreement with empirical evidence. This suggests 
that the function of the biochemical core of the energetics of cellular systems mir- 
rors at least to some extent the basic notions of stability and optimality that are 
encoded in Von Neumann's model and possibly underlie optimal economic growth. 
Understanding how this picture is affected by the cross-talk between metabolism and 
regulation, which could be the major source of non-linearity, is perhaps the most 
obvious next step. 
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